The R-square measures for the M2E models constructed by four algorithms.

Lasso stands for the Lasso regression models, Ridge stands for the RLR models, Forest stands for the random forest models, and SVM stands for the SVM regression models.

For each of the four regression models constructed for a DEG, say the gth, the DMSs employed as the independent variables were ranked. Suppose the mth DMS was ranked at the top. Its host gene was found and was denoted by k. Note that the kth gene was not necessarily a DEG. Whether the kth gene was the host gene for the methylation site of the mth DMS was determined by the following method. Suppose the start and end base pairs of the kth gene were denoted by $\vartheta_{k,0}$ and $\vartheta_{k,1}$. The interval was then defined as $[\vartheta_{k,0} - 1000,\ \vartheta_{k,1} + 1000]$ for searching for the host gene for the methylation site. Suppose the base pair of the methylation site of the mth DMS was denoted as $\omega$. If $\vartheta_{k,0} - 1000 \le \omega \le \vartheta_{k,1} + 1000$, the kth gene was the host gene of the methylation site of the mth DMS.
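
The interval test described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the `Gene` record, the helper names `is_host_gene` and `find_host_genes`, and the example coordinates are all hypothetical, and the sketch assumes that gene and site coordinates lie on the same chromosome and use the same coordinate system.

```python
from typing import List, NamedTuple

FLANK = 1000  # base pairs added on each side of a gene, as in the text


class Gene(NamedTuple):
    name: str   # hypothetical gene label
    start: int  # start base pair of the gene
    end: int    # end base pair of the gene


def is_host_gene(gene: Gene, site: int, flank: int = FLANK) -> bool:
    """Return True if start - flank <= site <= end + flank."""
    return gene.start - flank <= site <= gene.end + flank


def find_host_genes(genes: List[Gene], site: int, flank: int = FLANK) -> List[Gene]:
    """Return every gene whose extended interval contains the site."""
    return [g for g in genes if is_host_gene(g, site, flank)]


# Made-up coordinates, for illustration only.
genes = [Gene("geneA", 5_000, 9_000), Gene("geneB", 20_000, 26_000)]
hosts = find_host_genes(genes, 9_800)
print([g.name for g in hosts])
```

Because the extended intervals of neighbouring genes can overlap, the sketch returns a list of candidates. How such ties would be resolved is not stated in the text.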

The genes were ordered in terms of their sequential occurrence in a genome. The genes were indexed by their sequential occurrence positions in the genome. For instance, the first gene was assigned the integer one, the second gene was assigned the integer two, the third gene was assigned the integer three, etc. Therefore, the gth gene was assigned the integer g and the kth gene was assigned the integer k.
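
The indexing step can be illustrated as follows. The sketch assumes that "sequential occurrence" means ordering by start base pair, which the text does not state explicitly; the gene names, the coordinates, and the `index_genes` helper are hypothetical.

```python
from typing import Dict, List, Tuple


def index_genes(genes: List[Tuple[str, int]]) -> Dict[str, int]:
    """Map each gene name to its 1-based position in the ordering.

    `genes` holds (name, start_base_pair) pairs. Ordering by start
    position is an assumption made for this sketch.
    """
    ordered = sorted(genes, key=lambda item: item[1])
    return {name: rank for rank, (name, _) in enumerate(ordered, start=1)}


# Made-up genes, deliberately listed out of order.
gene_index = index_genes([("geneC", 52_000), ("geneA", 5_000), ("geneB", 20_000)])
print(gene_index)
```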

[...] distances were calculated. The first distance was named as the [...]er distance and was the absolute difference between g and k, that is, $|g - k|$. It must be noted that the gth gene was a DEG and the kth gene was not necessarily